Smart Paper Notes

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
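
The two validation practices the survey reports as underused, k-fold cross-validation and ensembling, can be sketched in a few lines. This is a minimal illustration, not anything taken from the surveyed challenge solutions; the function names `k_fold_indices` and `ensemble_mean` are hypothetical.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k (train, validation) index pairs."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # Distribute the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, val))
        start += size
    return folds


def ensemble_mean(predictions: List[List[float]]) -> List[float]:
    """Average per-sample predictions from several models (simple ensembling)."""
    n_models = len(predictions)
    return [sum(column) / n_models for column in zip(*predictions)]
```

Each training sample appears in exactly one validation fold, and one model trained per fold gives a natural set of "multiple identical models" to average.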

Objective Surgical Skills Assessment and Tool Localization: Results from the MICCAI 2021 SimSurgSkill Challenge

Aneeq Zia, Kiran Bhattacharyya, Xi Liu, Ziheng Wang, Max Berniker, Satoshi Kondo, Emanuele Colleoni, Dimitris Psychogyios, Yueming Jin, Jinfan Zhou

Categories: Computer Vision

2022-12-08

Timely and effective feedback within surgical training plays a critical role in developing the skills required to perform safe and efficient surgery. Feedback from expert surgeons, while especially valuable in this regard, is challenging to acquire due to their typically busy schedules, and may be subject to biases. Formal assessment procedures like OSATS and GEARS attempt to provide objective measures of skill, but remain time-consuming. With advances in machine learning there is an opportunity for fast and objective automated feedback on technical skills. The SimSurgSkill 2021 challenge (hosted as a sub-challenge of EndoVis at MICCAI 2021) aimed to promote and foster work in this endeavor. Using virtual reality (VR) surgical tasks, competitors were tasked with localizing instruments and predicting surgical skill. Here we summarize the winning approaches and how they performed. Using this publicly available dataset and results as a springboard, future work may enable more efficient training of surgeons with advances in surgical data science. The dataset can be accessed from https://console.cloud.google.com/storage/browser/isi-simsurgskill-2021.

Oracle-free Reinforcement Learning in Mean-Field Games along a Single Sample Path

Muhammad Aneeq uz Zaman, Alec Koppel, Sujay Bhatt, Tamer Başar

Categories: Machine Learning

2022-08-24

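We consider online reinforcement learning in mean-field games. In contrast to existing works, we alleviate the need for a mean-field oracle by developing an algorithm that estimates the mean field and the optimal policy using a single sample path of the generic agent. We call this sandbox learning, as it can be used as a warm start for any agent operating in a multi-agent non-cooperative setting. We adopt a two-timescale approach in which an online fixed-point recursion for the mean field operates on a slower timescale, in tandem with a control policy update for the generic agent on a faster timescale. Under a sufficient exploration condition, we provide finite-sample convergence guarantees in terms of convergence of the mean field and control policy to the mean-field equilibrium. The sample complexity of the sandbox learning algorithm is $\mathcal{O}(\epsilon^{-4})$. Finally, we empirically demonstrate the effectiveness of the sandbox learning algorithm in a traffic congestion game.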
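
The two-timescale idea can be illustrated with a toy stochastic-approximation loop. This is a sketch under strong assumptions and is not the paper's algorithm: the scalar "best response" `0.5 * mu + 1` and "induced mean field" `0.5 * theta` are hypothetical stand-ins, as are the step-size exponents 0.6 and 0.9. The only property being shown is that the mean-field step size shrinks faster than the policy step size, so the policy iterate sees a nearly frozen mean field.

```python
import random


def two_timescale_toy(steps: int = 20000, noise: float = 0.1, seed: int = 0):
    """Toy two-timescale stochastic approximation (not the paper's algorithm).

    Hypothetical scalar model: best response b(mu) = 0.5 * mu + 1 and
    induced mean field m(theta) = 0.5 * theta, so the joint fixed point is
    theta = 4/3, mu = 2/3.
    """
    rng = random.Random(seed)
    theta, mu = 0.0, 0.0
    for k in range(steps):
        fast = 1.0 / (k + 1) ** 0.6   # policy step size (faster timescale)
        slow = 1.0 / (k + 1) ** 0.9   # mean-field step size (slower timescale)
        # Noisy single-sample estimates of the two targets.
        target_theta = 0.5 * mu + 1.0 + rng.gauss(0.0, noise)
        target_mu = 0.5 * theta + rng.gauss(0.0, noise)
        theta += fast * (target_theta - theta)
        mu += slow * (target_mu - mu)
    return theta, mu
```

Calling `two_timescale_toy()` should return values close to the fixed point (4/3, 2/3) of the toy model; the ratio `slow / fast` tends to zero, which is the defining condition of a two-timescale scheme.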

Timestamp-Supervised Action Segmentation with Graph Convolutional Networks

Hamza Khan, Sanjay Haresh, Awais Ahmed, Shakeeb Siddiqui, Andrey Konin, M. Zeeshan Zia, Quoc-Huy Tran

Categories: Computer Vision

2022-06-30

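We introduce a novel approach for temporal activity segmentation with timestamp supervision. Our main contribution is a graph convolutional network, learned in an end-to-end manner, that exploits both frame features and connections between neighboring frames to generate dense framewise labels from sparse timestamp labels. The generated dense framewise labels can then be used to train the segmentation model. In addition, we propose a framework for alternating learning of the segmentation model and the graph convolutional model, which first initializes and then iteratively refines the learned models. Detailed experiments on four public datasets, including 50 Salads, GTEA, Breakfast, and Desktop Assembly, show that our method is superior to the multi-layer perceptron baseline, while performing on par with or better than the state of the art in temporal activity segmentation with timestamp supervision.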
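
To make "exploiting connections between neighboring frames" concrete, the following sketch applies a single graph-convolution step to a video modeled as a temporal chain. Both the chain graph and the symmetric normalization (in the style of the common Kipf and Welling formulation) are assumptions for illustration; the abstract does not specify the authors' graph construction or layer design, and all function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def chain_adjacency(n_frames: int) -> Matrix:
    """Adjacency with self-loops for a temporal chain: frame t links to t-1, t+1."""
    return [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n_frames)]
            for i in range(n_frames)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    `adj` is expected to already contain the self-loops (A + I).
    """
    deg = [sum(row) for row in adj]
    norm = [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(len(adj))]
            for i in range(len(adj))]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]
```

With a one-hot feature on the first of three frames, one step spreads part of that signal to the adjacent frame and none to the frame two steps away, which is how label evidence at a timestamp can reach nearby unlabeled frames as layers are stacked.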

Regression of high dimensional angular momentum states of light

Danilo Zia, Riccardo Checchinato, Alessia Suprano, Taira Giordani, Emanuele Polino, Luca Innocenti, Alessandro Ferraro, Mauro Paternostro, Nicolò Spagnolo, Fabio Sciarrino

Categories: Machine Learning

2022-06-20

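The orbital angular momentum (OAM) of light is an infinite-dimensional degree of freedom with many applications in both classical and quantum optics. However, to fully exploit the potential of OAM states, reliable detection platforms are needed to characterize generated states under experimental conditions. Here, we present an approach that reconstructs input OAM states from measurements of the spatial intensity distributions they produce. To obviate issues arising from the intrinsic symmetry of Laguerre-Gauss modes, we project each state onto only two distinct bases and show how this allows the input state to be uniquely recovered from the collected data. Our approach combines dimensionality reduction via principal component analysis with linear regression, and therefore has a low computational cost in both the training and testing stages. We showcase our approach in a real photonic setup, generating OAM states through quantum walk dynamics. The high performance and versatility of the demonstrated approach make it an ideal tool for characterizing high-dimensional states in quantum information protocols.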

Neural Stochastic Dual Dynamic Programming

Hanjun Dai, Yuan Xue, Zia Syed, Dale Schuurmans, Bo Dai

Categories: Machine Learning | Machine Learning (Statistics)

2021-12-01

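Stochastic dual dynamic programming (SDDP) is a state-of-the-art method for solving multi-stage stochastic optimization, widely used for modeling real-world process optimization tasks. Unfortunately, SDDP has a worst-case complexity that scales exponentially in the number of decision variables, which severely limits its applicability to low-dimensional problems. To overcome this limitation, we extend SDDP by introducing a trainable neural model that learns to map problem instances to a piecewise linear value function within an intrinsic low-dimensional space. The model is architected specifically to interact with a base SDDP solver, so that it can accelerate optimization performance on new instances. The proposed Neural Stochastic Dual Dynamic Programming ($\nu$-SDDP) continually self-improves by solving successive problems. An empirical investigation demonstrates that $\nu$-SDDP can significantly reduce problem-solving cost without sacrificing solution quality relative to competitors, across a range of synthetic and real-world process optimization problems.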
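
The "piecewise linear value function" at the heart of SDDP is a maximum over affine cuts, each of which lower-bounds the convex cost-to-go. The sketch below shows only that representation; it is not the paper's neural model or solver, and the names `Cut`, `add_cut`, and `value_lower_bound` are hypothetical.

```python
from typing import List, Sequence, Tuple

Cut = Tuple[Sequence[float], float]  # (slope vector, intercept)


def value_lower_bound(cuts: List[Cut], x: Sequence[float]) -> float:
    """Evaluate V(x) ~ max_i (slope_i . x + intercept_i) over all cuts."""
    if not cuts:
        raise ValueError("at least one cut is required")
    return max(sum(s * xi for s, xi in zip(slope, x)) + b for slope, b in cuts)


def add_cut(cuts: List[Cut], slope: Sequence[float], value: float,
            x_trial: Sequence[float]) -> None:
    """Append the cut tangent at x_trial: value + slope . (x - x_trial)."""
    intercept = value - sum(s * xi for s, xi in zip(slope, x_trial))
    cuts.append((tuple(slope), intercept))
```

For example, two cuts with slopes -1 and +1 taken at trial points 0 and 2 (each with value 1) reproduce the function |x - 1|. A classical SDDP solver accumulates such cuts one trial point at a time, which is the cost that a model predicting good cuts up front would aim to reduce.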

Unsupervised Activity Segmentation by Joint Representation Learning and Online Clustering

Sateesh Kumar, Sanjay Haresh, Awais Ahmed, Andrey Konin, M. Zeeshan Zia, Quoc-Huy Tran

Categories: Computer Vision

2021-05-27

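We present a novel approach for unsupervised activity segmentation that uses video frame clustering as a pretext task and simultaneously performs representation learning and online clustering. This is in contrast with prior works, where representation learning and clustering are often performed sequentially. We leverage temporal information in videos by employing temporal optimal transport. In particular, we incorporate a temporal regularization term, which preserves the temporal order of the activity, into the standard optimal transport module for computing pseudo-label cluster assignments. The temporal optimal transport module enables our approach to learn effective representations for unsupervised activity segmentation. Furthermore, previous methods require storing learned features for the entire dataset before clustering them in an offline manner, whereas our approach processes one mini-batch at a time in an online manner. Extensive evaluations on three public datasets, namely 50 Salads, YouTube Instructions, and Breakfast, and on our own dataset, Desktop Assembly, show that our approach performs on par with or better than previous methods for unsupervised activity segmentation, despite having significantly lower memory requirements.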
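
The "standard optimal transport module for computing pseudo-label cluster assignments" can be approximated with Sinkhorn-style alternating normalization. The sketch below covers only that standard step and deliberately omits the temporal regularization term the abstract describes; the parameters `eps` and `n_iters`, the equal-size cluster marginal, and the function names are assumptions for illustration.

```python
import math
from typing import List


def sinkhorn_assign(scores: List[List[float]], eps: float = 0.5,
                    n_iters: int = 200) -> List[List[float]]:
    """Entropic optimal transport from N frames to K clusters.

    Returns a plan whose rows each sum to 1/N and whose columns each sum
    to approximately 1/K (equal-size clusters).
    """
    n, k = len(scores), len(scores[0])
    plan = [[math.exp(s / eps) for s in row] for row in scores]
    for _ in range(n_iters):
        # Scale columns so every cluster receives mass 1/K.
        col_sums = [sum(plan[i][j] for i in range(n)) for j in range(k)]
        plan = [[plan[i][j] / (k * col_sums[j]) for j in range(k)] for i in range(n)]
        # Scale rows so every frame carries mass 1/N.
        row_sums = [sum(row) for row in plan]
        plan = [[v / (n * row_sums[i]) for v in row] for i, row in enumerate(plan)]
    return plan


def pseudo_labels(plan: List[List[float]]) -> List[int]:
    """Hard pseudo-label per frame: the cluster with the largest mass."""
    return [max(range(len(row)), key=row.__getitem__) for row in plan]
```

Because the routine only needs the similarity scores of the current frames, it can be run on one mini-batch at a time, which is consistent with the online setting the abstract contrasts against offline clustering of stored features.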

Hyperspectral Image Super-Resolution in Arbitrary Input-Output Band Settings

Zhongyang Zhang, Zhiyang Xu, Zia Ahmed, Asif Salekin, Tauhidur Rahman

Categories: Computer Vision

2021-03-19

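Hyperspectral images (HSIs) with narrow spectral bands can capture rich spectral information, but they sacrifice spatial resolution in the process. Many machine-learning-based HSI super-resolution (SR) algorithms have been proposed recently. However, one fundamental limitation of these approaches is that they are highly dependent on image and camera settings and can only learn to map an input HSI with one specific setting to an output HSI with another specific setting. Yet, because of the diversity of HSI cameras, different cameras capture images with different spectral response functions and numbers of bands. Consequently, existing machine-learning-based approaches fail to learn to super-resolve HSIs for a wide variety of input-output band settings. We propose a meta-learning-based super-resolution (MLSR) model, which can take in HSIs at an arbitrary number of input bands' peak wavelengths and generate SR HSIs with an arbitrary number of output bands' peak wavelengths. We use the NTIRE2020 and ICVL datasets to train the MLSR model and validate its performance. The results show that the single proposed model can successfully generate super-resolved HSI bands under arbitrary input-output band settings. The results are better than, or at least comparable to, baselines that are trained separately on a specific input-output band setting.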

Domain-Specific Priors and Meta Learning for Few-Shot First-Person Action Recognition

Huseyin Coskun, Zeeshan Zia, Bugra Tekin, Federica Bogo, Nassir Navab, Federico Tombari, Harpreet Sawhney

Categories: Computer Vision

2019-07-22

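The lack of large-scale real datasets with annotations makes transfer learning a necessity for video activity understanding. We aim to develop an effective method for few-shot transfer learning for first-person action classification. We leverage independently trained local visual cues to learn representations that can be transferred from a source domain to a different target domain using only a handful of examples. The visual cues we employ include object-object interactions, hand grasps, and motion within regions that are a function of hand locations. We employ a framework based on meta-learning to extract the distinctive and domain-invariant components of the deployed visual cues. This enables the transfer of action classification models across public datasets captured with diverse scene and action configurations. We present comparative results of our transfer learning methodology and report superior results over state-of-the-art action classification approaches for both inter-class and inter-dataset transfer.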
